    Color Name Applications in Computer Vision

    In Computer Vision, the association of names with colors is one of the fundamental problems in the field of image understanding. Numerous computational applications (e.g. image retrieval, visual tracking, person identification, and human-machine interaction) require pixels to be labelled according to the color perceived by the user. This is relatively easy for focal colors under canonical illuminants, where agreement among observers is high, but it becomes increasingly difficult as perceptions move away from these conditions. For these difficult cases, the traditional solution has been a collection of ad-hoc strategies; however, new approaches that combine knowledge from anthropology, linguistics, visual perception, and machine learning have produced promising results. In particular, deep neural networks appear to possess all the building blocks required to offer a color naming solution "in the wild". This article reviews the current state of knowledge and discusses open challenges with a multidisciplinary (and non-specialized) readership in mind.
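    The nearest-focal-colour idea behind simple colour naming can be sketched as follows. The eleven basic colour terms are standard, but the focal RGB coordinates and the Euclidean-distance rule below are illustrative assumptions, not a method from the article:

```python
# Hypothetical sketch: naive color naming by nearest focal color in RGB space.
# The focal-color coordinates below are illustrative, not from the article.
import math

FOCAL_COLORS = {
    "black":  (0, 0, 0),
    "white":  (255, 255, 255),
    "red":    (255, 0, 0),
    "green":  (0, 128, 0),
    "blue":   (0, 0, 255),
    "yellow": (255, 255, 0),
    "orange": (255, 165, 0),
    "purple": (128, 0, 128),
    "pink":   (255, 192, 203),
    "brown":  (139, 69, 19),
    "gray":   (128, 128, 128),
}

def name_color(rgb):
    """Return the basic color term whose focal RGB is nearest (Euclidean)."""
    return min(FOCAL_COLORS, key=lambda term: math.dist(rgb, FOCAL_COLORS[term]))
```

    Such a lookup works only for focal colors under canonical illuminants; the hard cases discussed in the article arise precisely where this simple rule breaks down.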

    Feedback and surround modulated boundary detection

    Additional support: CERCA Programme / Generalitat de Catalunya.
    Edges are key components of any visual scene, to the extent that we can recognise objects merely by their silhouettes. The human visual system captures edge information through neurons in the visual cortex that are sensitive to both intensity discontinuities and particular orientations. The "classical approach" assumes that these cells respond only to the stimulus present within their receptive fields; however, recent studies demonstrate that surrounding regions and inter-areal feedback connections influence their responses significantly. In this work we propose a biologically inspired edge detection model in which orientation-selective neurons are represented by the first derivative of a Gaussian function, resembling double-opponent cells in the primary visual cortex (V1). Our model accounts for four kinds of receptive-field surround, i.e. full, far, iso- and orthogonal-orientation, whose contributions are contrast-dependent. The output signal from V1 is pooled in its perpendicular direction by larger V2 neurons employing a contrast-variant centre-surround kernel. We further introduce a feedback connection from higher-level visual areas to the lower ones. The results of our model on three benchmark datasets show a substantial improvement over current non-learning, biologically inspired state-of-the-art algorithms, while remaining competitive with learning-based methods.
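    The orientation-selective V1 stage described above can be sketched with a bank of first-derivative-of-Gaussian filters. Kernel size, sigma, and the max-over-orientations pooling are illustrative choices; the surround and feedback mechanisms of the full model are omitted:

```python
# Sketch of an oriented edge-filter bank using the first derivative of a
# Gaussian. Parameters (size, sigma, 4 orientations) are illustrative.
import numpy as np

def gaussian_derivative_kernel(sigma, theta, size=9):
    """First derivative of a 2D Gaussian, oriented at angle theta (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates so the derivative is taken along direction theta.
    u = x * np.cos(theta) + y * np.sin(theta)
    v = -x * np.sin(theta) + y * np.cos(theta)
    g = np.exp(-(u**2 + v**2) / (2 * sigma**2))
    return -u / sigma**2 * g  # d/du of the Gaussian envelope

def convolve2d(image, kernel):
    """Plain 'same'-size convolution with edge padding (no SciPy dependency)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="edge")
    flipped = kernel[::-1, ::-1]
    out = np.zeros(image.shape)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

def oriented_edge_response(image, sigma=1.5, n_orientations=4):
    """Max response over the bank of oriented Gaussian-derivative filters."""
    image = np.asarray(image, dtype=float)
    responses = [
        np.abs(convolve2d(image,
                          gaussian_derivative_kernel(sigma, k * np.pi / n_orientations)))
        for k in range(n_orientations)
    ]
    return np.max(responses, axis=0)
```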

    Colour constancy beyond the classical receptive field

    The problem of removing illuminant variations to preserve the colours of objects (colour constancy) has already been solved by the human brain using mechanisms that rely largely on centre-surround computations of local contrast. In this paper we adopt some of these biological solutions, described in long-known physiological findings, into a simple, fully automatic, functional model (termed Adaptive Surround Modulation or ASM). In ASM, the size of a visual neuron's receptive field (RF), as well as the relationship with its surround, varies according to the local contrast within the stimulus, which in turn determines the nature of the centre-surround normalisation of cortical neurons higher up in the processing chain. We modelled colour constancy by means of two overlapping asymmetric Gaussian kernels whose sizes are adapted based on the contrast of the surround pixels, resembling the change of RF size. We simulated the contrast-dependent surround modulation by weighting the contribution of each Gaussian according to the centre-surround contrast. Finally, we obtained an estimation of the illuminant from the set of the most activated RFs' outputs. Our results on three single-illuminant and one multi-illuminant benchmark datasets show that ASM is highly competitive against the state-of-the-art, and it even outperforms learning-based algorithms in one case. Moreover, the robustness of our model is more tangible if we consider that our results were obtained using the same parameters for all datasets, that is, mimicking how the human visual system operates. These results suggest that dynamic adaptation mechanisms contribute to achieving higher accuracy in computational colour constancy.
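    A loose sketch of the centre-surround adaptation idea follows, assuming a crude contrast measure (difference between a narrow and a wide Gaussian) and top-percentile pooling. These choices are illustrative and much simpler than the paper's actual ASM model:

```python
# Simplified illustration of contrast-weighted centre-surround pooling for
# illuminant estimation. Kernel sizes, the contrast measure, and the
# top-percentile pooling are illustrative assumptions, not the paper's model.
import numpy as np

def gaussian_blur(channel, sigma):
    """Separable Gaussian blur with reflect padding."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    padded = np.pad(channel, radius, mode="reflect")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def estimate_illuminant(image, sigma_small=1.0, sigma_large=4.0, top=0.05):
    """image: H x W x 3 float array. Returns a unit-norm 3-vector estimate."""
    est = np.zeros(3)
    for c in range(3):
        small = gaussian_blur(image[..., c], sigma_small)
        large = gaussian_blur(image[..., c], sigma_large)
        contrast = np.abs(small - large)           # crude centre-surround contrast
        w = contrast / (contrast.max() + 1e-12)    # high contrast -> narrow surround
        response = w * small + (1 - w) * large
        k = max(1, int(top * response.size))
        est[c] = np.sort(response.ravel())[-k:].mean()  # pool most activated outputs
    return est / (np.linalg.norm(est) + 1e-12)
```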

    Which tone-mapping operator is the best? A comparative study of perceptual quality

    Additional support: CERCA Programme / Generalitat de Catalunya. Published under Optica Publishing Group's Open Access Publishing Agreement: https://opg.optica.org/submit/review/pdf/CopyrightTransferOpenAccessAgreement-2022-06-27.pdf
    Tone-mapping operators (TMOs) are designed to generate perceptually similar low-dynamic-range images from high-dynamic-range ones. We studied the performance of 15 TMOs in two psychophysical experiments in which observers compared digitally generated tone-mapped images to their corresponding physical scenes. All experiments were performed in a controlled environment, and the setups were designed to emphasize different image properties: in the first experiment we evaluated the local relationships among intensity levels, and in the second we evaluated global visual appearance between physical scenes and tone-mapped images presented side by side. We ranked the TMOs according to how well they reproduced the results obtained in the physical scene. Our results show that ranking position clearly depends on the adopted evaluation criteria, which implies that, in general, these tone-mapping algorithms consider either local or global image attributes but rarely both. Regarding the question of which TMO is the best, KimKautz ["Consistent tone reproduction," in Proceedings of Computer Graphics and Imaging (2008)] and Krawczyk ["Lightness perception in tone reproduction for high dynamic range images," in Proceedings of Eurographics (2005), p. 3] obtained the best results across the different experiments. We conclude that more thorough and standardized evaluation criteria are needed to study all the characteristics of TMOs, as there is ample room for improvement in future developments.
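    For context, a minimal global tone-mapping operator shows the basic shape of the mapping such studies evaluate. This is the classic Reinhard global operator, not one of the operators ranked above; the key value a = 0.18 is the conventional default:

```python
# Classic Reinhard global tone-mapping operator (a standard baseline TMO,
# shown for illustration; it is not one of the 15 operators in the study).
import numpy as np

def reinhard_global(luminance, a=0.18, eps=1e-6):
    """Map HDR luminance (positive floats) to display luminance in [0, 1)."""
    log_mean = np.exp(np.mean(np.log(luminance + eps)))  # geometric mean ("key")
    scaled = a * luminance / log_mean                    # key-value scaling
    return scaled / (1.0 + scaled)                       # compressive mapping
```

    The compressive final step is what preserves detail in highlights while keeping mid-tones near the display range; local operators differ mainly in making that compression spatially varying.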

    Limitations of visual gamma corrections in LCD displays

    A method for estimating the non-linear gamma transfer function of liquid-crystal displays (LCDs) without the need for a photometric measurement device was described by Xiao et al. (2011) [1]. It relies on observers' judgments of visual luminance, obtained by presenting eight half-tone patterns with luminances from 1/9 to 8/9 of the maximum value of each colour channel. These half-tone patterns were distributed over the screen along both the vertical and horizontal viewing axes. We conducted a series of photometric and psychophysical measurements (consisting of the simultaneous presentation of half-tone patterns in each trial) to evaluate whether the angular dependency of the light generated by three different LCD technologies would bias the results of these gamma transfer function estimations. Our results show that there are significant differences between the gamma transfer functions measured and those produced by observers at different viewing angles. We suggest appropriate modifications to the Xiao et al. paradigm to counterbalance these artefacts; these modifications also have the advantage of shortening the time spent collecting the psychophysical measurements.
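    The logic of recovering gamma from half-tone matches can be sketched as follows, assuming the observer adjusts a uniform patch's digital value until it visually matches a half-tone of relative luminance k/9. The simulated observer data below are illustrative:

```python
# Sketch: fit display gamma from half-tone matching data, assuming the model
# luminance = digital_value ** gamma. Observer data here are simulated.
import math

def fit_gamma(matches):
    """matches: list of (digital_value in (0,1), relative_luminance) pairs.
    Least-squares fit of gamma through the origin in log-log space."""
    num = sum(math.log(v) * math.log(lum) for v, lum in matches)
    den = sum(math.log(v) ** 2 for v, lum in matches)
    return num / den

# Simulate a veridical observer on a display with true gamma 2.2: the match
# for relative luminance k/9 is the digital value (k/9) ** (1 / gamma).
true_gamma = 2.2
matches = [((k / 9) ** (1 / true_gamma), k / 9) for k in range(1, 9)]
print(round(fit_gamma(matches), 2))  # prints 2.2
```

    The viewing-angle artefact studied above would show up here as systematically biased `matches` at off-axis positions, and hence a biased fitted gamma.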

    Multiresolution wavelet framework models brightness induction effects

    A new multiresolution wavelet model is presented here, which accounts for brightness assimilation and contrast effects in a unified framework, and includes known psychophysical and physiological attributes of the primate visual system (such as spatial frequency channels, oriented receptive fields, contrast sensitivity function, contrast non-linearities, and a unified set of parameters). Like other low-level models, such as the ODOG model [Blakeslee, B., & McCourt, M. E. (1999). A multiscale spatial filtering account of the white effect, simultaneous brightness contrast and grating induction. Vision Research, 39, 4361-4377], this formulation reproduces visual effects such as simultaneous contrast, the White effect, grating induction, the Todorović effect, Mach bands, the Chevreul effect and the Adelson-Logvinenko tile effects, but it also reproduces other previously unexplained effects such as the dungeon illusion, all using a single set of parameters.
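    The multiresolution decomposition underlying such models can be illustrated in one dimension with a Haar wavelet: decompose a signal into frequency channels, reweight each channel (a stand-in for the contrast sensitivity function), and reconstruct. The paper's model is 2D, oriented, and non-linear; this is only the skeleton:

```python
# Minimal 1D Haar multiresolution analysis/synthesis with per-channel weights.
# The reweighting step stands in, very loosely, for a contrast sensitivity
# function; the actual model in the paper is 2D, oriented, and non-linear.
import numpy as np

def haar_decompose(signal):
    """Return (approximation, [detail_finest, ..., detail_coarsest])."""
    details = []
    approx = np.asarray(signal, dtype=float)
    while len(approx) > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / 2.0)   # high-frequency channel
        approx = (even + odd) / 2.0          # low-frequency residual
    return approx, details

def haar_reconstruct(approx, details, weights=None):
    """Invert the decomposition, optionally scaling each detail channel."""
    weights = weights or [1.0] * len(details)
    for d, w in zip(reversed(details), reversed(weights)):
        up = np.empty(2 * len(d))
        up[0::2] = approx + w * d
        up[1::2] = approx - w * d
        approx = up
    return approx
```

    With unit weights the reconstruction is exact; attenuating or boosting individual channels is how a multiresolution model redistributes perceived brightness across spatial frequencies.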

    Using two-alternative forced choice tasks and Thurstone's law of comparative judgments for code-switching research

    This article argues that two-alternative forced choice tasks and Thurstone's law of comparative judgments (Thurstone, 1927) are well suited to investigate code-switching competence by means of acceptability judgments. We compare this method with commonly used Likert scale judgments and find that the two-alternative forced choice task provides granular details that remain invisible in a Likert scale experiment. In order to compare and contrast both methods, we examined the syntactic phenomenon usually referred to as the Adjacency Condition (AC) (apud Stowell, 1981), which imposes a condition of adjacency between verb and object. Our interest in the AC comes from the fact that it is a subtle feature of English grammar which is absent in Spanish, and this provides an excellent springboard to create minimal code-switched pairs that allow us to formulate a clear research question that can be tested using both methods.
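    Thurstone's Case V scaling, which converts pairwise 2AFC choice proportions into interval-scale values via the inverse normal CDF, can be sketched as follows; the toy choice proportions are illustrative:

```python
# Thurstone Case V scaling from a matrix of pairwise 2AFC choice proportions.
# The three-item proportions below are a made-up illustration.
from statistics import NormalDist

def thurstone_case_v(prop):
    """prop[i][j] = proportion of trials where item i was chosen over item j.
    Returns one interval-scale value per item (mean z-score across pairs)."""
    z = NormalDist().inv_cdf
    n = len(prop)
    scores = []
    for i in range(n):
        zs = [z(min(max(prop[i][j], 0.01), 0.99))  # clip 0/1 proportions
              for j in range(n) if j != i]
        scores.append(sum(zs) / len(zs))
    return scores

# Three items: A chosen over B 75% of the time, over C 90%; B over C 70%.
prop = [[0.50, 0.75, 0.90],
        [0.25, 0.50, 0.70],
        [0.10, 0.30, 0.50]]
scores = thurstone_case_v(prop)
```

    The resulting scale separations (not just the ordering) are what make this method more granular than Likert ratings.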

    Low-level spatiochromatic grouping for saliency estimation

    We propose a saliency model termed SIM (saliency by induction mechanisms), which is based on a low-level spatiochromatic model that has successfully predicted chromatic induction phenomena. In so doing, we hypothesize that the low-level visual mechanisms that enhance or suppress image detail are also responsible for making some image regions more salient. Moreover, SIM adds geometrical grouplets to enhance complex low-level features such as corners, and to suppress relatively simpler features such as edges. Since our model has been fitted on psychophysical chromatic induction data, it is largely nonparametric. SIM outperforms state-of-the-art methods in predicting eye fixations on two datasets and using two metrics.

    Saliency estimation using a non-parametric low-level vision model

    Many successful models for predicting attention in a scene involve three main steps: convolution with a set of filters, a center-surround mechanism, and spatial pooling to construct a saliency map. However, integrating spatial information and justifying the choice of various parameter values remain open problems. In this paper we show that an efficient model of color appearance in human vision, which contains a principled selection of parameters as well as an innate spatial pooling mechanism, can be generalized to obtain a saliency model that outperforms state-of-the-art models. Scale integration is achieved by an inverse wavelet transform over the set of scale-weighted center-surround responses. The scale-weighting function (termed ECSF) has been optimized to better replicate psychophysical data on color appearance, and the appropriate sizes of the center-surround inhibition windows have been determined by training a Gaussian Mixture Model on eye-fixation data, thus avoiding ad-hoc parameter selection. Additionally, we conclude that the extension of a color appearance model to saliency estimation adds to the evidence for a common low-level visual front-end for different visual tasks.
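    The three-step filter / center-surround / pooling pipeline described above can be caricatured with difference-of-box-filters standing in for the wavelet transform. The scales and weights below are illustrative assumptions, not the model's optimized ECSF values:

```python
# Caricature of the filter -> center-surround -> pooling saliency pipeline.
# Box filters stand in for wavelet scales; scales and weights are illustrative.
import numpy as np

def box_blur(img, radius):
    """Mean filter via edge padding (a crude surrogate for one filter scale)."""
    padded = np.pad(img, radius, mode="edge")
    size = 2 * radius + 1
    out = np.zeros(img.shape)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + size, j:j + size].mean()
    return out

def saliency_map(img, scales=((1, 4), (2, 8)), weights=(1.0, 0.5)):
    """Sum of scale-weighted |center - surround| responses, normalized to 1."""
    img = np.asarray(img, dtype=float)
    sal = np.zeros(img.shape)
    for (center_r, surround_r), w in zip(scales, weights):
        sal += w * np.abs(box_blur(img, center_r) - box_blur(img, surround_r))
    return sal / (sal.max() + 1e-12)
```

    In this toy version the scale weights are fixed; the point of the paper's approach is that the analogous weighting function was fitted to color-appearance data rather than chosen ad hoc.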